136 research outputs found

Analysis of 1276 Haplotype-Resolved Genomes Allows Characterization of Cis- and Trans-Abundant Genes

Author: Herwig R.
Hoehe M.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/11/2022

Many methods for haplotyping have materialized, but their application on a significant scale has been rare to date. Here we summarize analyses that were carried out in 1092 genomes from the 1000 Genomes Consortium and validated in an unprecedented number of 184 PGP genomes that have been experimentally haplotype-resolved by application of the Long-Fragment Read (LFR) technology. These analyses provided first insights into the diplotypic nature of human genomes and its potential functional implications. Thus, protein-changing variants were not randomly distributed between the two homologues of 18,121 autosomal protein-coding genes but occurred significantly more frequently in cis than in trans configurations in virtually each of the 1276 phased genomes. This resulted in global cis/trans ratios of ~60:40, establishing “cis abundance” as a universal characteristic of diploid human genomes. This phenomenon was based on two different classes of genes, a larger one exhibiting cis configurations of protein-changing variants in excess, so-called “cis-abundant” genes, and a smaller one of “trans-abundant” genes. These two gene classes, which together constitute a common diplotypic exome, were further functionally distinguished by means of gene ontology (GO) and pathway enrichment analysis. Moreover, they were distinguishable in terms of their effects on the human interactome, where they constitute distinct cis and trans modules, as shown with network propagation on a large integrated protein–protein interaction network. These analyses, recently performed with updated database and analysis tools, further consolidated the characterization of cis- and trans-abundant genes while expanding previous results. In this chapter, we present the key results along with the materials and methods to motivate readers to investigate these findings independently and gain further insights into the diplotypic nature of genes and genomes.

MPG.PuRe

Hum Hered

Author: Hoehe M.
Vingron M.
Zhang J.
Publication venue: 'S. Karger AG'
Publication date: 01/01/2005

The inference of haplotype pairs directly from unphased genotype data is a key step in the analysis of genetic variation in relation to disease and pharmacogenetically relevant traits. Most popular methods, such as Phase and PL, require either the coalescence assumption or the assumption of linkage between the single-nucleotide polymorphisms (SNPs). We have now developed novel approaches that are independent of these assumptions. First, we introduce a new optimization criterion in combination with a block-wise evolutionary Monte Carlo algorithm. Based on this criterion, the 'haplotype likelihood', we develop two kinds of estimators, the maximum haplotype-likelihood (MHL) estimator and its empirical Bayesian (EB) version. Using both real and simulated data sets, we demonstrate that our proposed estimators allow substantial improvements over both the expectation-maximization (EM) algorithm and Clark's procedure in terms of capacity/scalability and error rate. Thus, hundreds or more ambiguous loci and potentially very large sample sizes can be processed. Moreover, applying our proposed EB estimator can result in significant reductions of error rate in the case of unlinked or only weakly linked SNPs.

MPG.PuRe

On haplotype reconstruction for diploid populations

Author: Hoehe M.R.
Vingron M.
Zhang J.
Publication venue: Eurandom
Publication date: 01/01/2001

Repository TU/e

Pure OAI Repository

Significant abundance of cis configurations of coding variants in diploid human genomes

Author: Church G.
Drmanac R.
Herwig R.
Hoehe M.
Huebsch T.
Mao Q.
Peters B.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 08/04/2019

To fully understand human genetic variation and its functional consequences, the specific distribution of variants between the two chromosomal homologues of genes must be known. The 'phase' of variants can significantly impact gene function and phenotype. To assess patterns of phase at large scale, we have analyzed 18 121 autosomal genes in 1092 statistically phased genomes from the 1000 Genomes Project and 184 experimentally phased genomes from the Personal Genome Project. Here we show that genes with cis-configurations of coding variants are more frequent than genes with trans-configurations in a genome, with global cis/trans ratios of ∼60:40. Significant cis-abundance was observed in virtually all genomes in all populations. Moreover, we identified a large group of genes exhibiting cis-configurations of protein-changing variants in excess, so-called 'cis-abundant genes', and a smaller group of 'trans-abundant genes'. These two gene categories were functionally distinguishable, and exhibited strikingly different distributional patterns of protein-changing variants. Underlying these phenomena was a shared set of phase-sensitive genes of importance for adaptation and evolution. This work establishes common patterns of phase as key characteristics of diploid human exomes and provides evidence for their functional significance, highlighting the importance of phase for the interpretation of protein-coding genetic variation and gene function.

MPG.PuRe

Multiple haplotype-resolved genomes reveal population patterns of gene and protein diplotypes

Author: Church G.
Hoehe M.
Huebsch T.
Kroslak T.
Lehrach H.
Nowick K.
Palczewski S.
Schulz S.
Suk E.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/11/2014

To fully understand human biology and link genotype to phenotype, the phase of DNA variants must be known. Here we present a comprehensive analysis of haplotype-resolved genomes to assess the nature and variation of haplotypes and their pairs, diplotypes, in European population samples. We use a set of 14 haplotype-resolved genomes generated by fosmid clone-based sequencing, complemented and expanded by up to 372 statistically resolved genomes from the 1000 Genomes Project. We find immense diversity of both haploid and diploid gene forms, up to 4.1 and 3.9 million corresponding to 249 and 235 per gene on average. Less than 15% of autosomal genes have a predominant form. We describe a ‘common diplotypic proteome’, a set of 4,269 genes encoding two different proteins in over 30% of genomes. We show moreover an abundance of cis configurations of mutations in the 386 genomes with an average cis/trans ratio of 60:40, and distinguishable classes of cis- versus trans-abundant genes. This work identifies key features characterizing the diplotypic nature of human genomes and provides a conceptual and analytical framework, rich resources and novel hypotheses on the functional importance of diploidy.

PubMed Central

MPG.PuRe

Multiple haplotype-resolved genomes reveal population patterns of gene and protein diplotypes

Author: Church George M.
Hoehe Margret R.
Huebsch Thomas
Kroslak Thomas
Lehrach Hans
Nowick Katja
Palczewski Stefanie
Schulz Sabrina
Suk Eun-Kyung
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/01/2015

CiteSeerX

Harvard University - DASH

Association of alpha1a-adrenergic receptor polymorphism and blood pressure phenotypes in the Brazilian population

Author: A Snapir
Alexandre C Pereira
AV Chobanian
B Lei
C Gu
CA Mochtar
CN Rotimi
DG Rokosh
DGD Gu
HG Xie
I Lessa
J He
JB Buckwalter
José E Krieger
José G Mill
JW Hsu
K McKenzie
K Shibata
M Iacoviello
M Ohyanagi
Marcilene S Floriano
MRBWH Hoehe
RP Lifton
S Jiang
SA Miller
Silvia R Freitas
SP Rao
WB Kannel
XL Rudner
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: The alpha1A-adrenergic receptor (alpha(1A)-AR) regulates the cardiac and peripheral vascular system through sympathetic activation. Due to its important role in the regulation of vascular tone and blood pressure, we aimed to investigate the association between the Arg347Cys polymorphism in the alpha(1A)-AR gene and blood pressure phenotypes in a large sample of Brazilians from an urban population. Methods: A total of 1568 individuals were randomly selected from the general population of the Vitoria City metropolitan area. Genetic analysis of the Arg347Cys polymorphism was conducted by polymerase chain reaction/restriction fragment length polymorphism. We compared cardiovascular risk variables and genotypes using ANOVA and the chi-square test for univariate comparisons and logistic regression for multivariate comparisons. Results: Association analysis indicated a significant difference between genotype groups with respect to diastolic blood pressure (p = 0.04), but not systolic blood pressure (p = 0.12). In addition, presence of the Cys/Cys genotype was marginally associated with hypertension in our population (p = 0.06). Significant interaction effects were observed between the studied genetic variant, age and physical activity. Presence of the Cys/Cys genotype was associated with hypertension only in individuals with regular physical activity (odds ratio = 1.86; p = 0.03) or younger than 45 years (odds ratio = 1.27; p = 0.04). Conclusion: Physical activity and age may play a role by disclosing the effects of the Cys allele on blood pressure. According to our data, it is possible that the Arg347Cys polymorphism can be used as a biomarker of disease risk in a selected group of individuals. FAPESP (Fundacao de Amparo a Pesquisa do Estado de Sao Paulo)[2001/03454-5

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Springer - Publisher Connector

PubMed Central

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Universidade de São Paulo

Model order selection for bio-molecular data clustering

Author: A Alizadeh
A Ben-Hur
A Bertoni
A Jain
Alberto Bertoni
D Achlioptas
E Bingham
E Levine
G Valentini
Giorgio Valentini
H Cramer
J Freund
J Handl
J McQueen
J Ward
L Kaufman
L McShane
M Hoehe
M Kerr
M Shipp
M Smolkin
N Bolshakova
N Garge
N Kaplan
R Tibshirani
S Ben-David
S Datta
S Dudoit
S Monti
T Golub
T Ho
T Lange
W Johnson
X Fern
Y Bilu
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Cluster analysis has been widely applied for investigating structure in bio-molecular data. A drawback of most clustering algorithms is that they cannot automatically detect the "natural" number of clusters underlying the data, and in many cases we do not have enough "a priori" biological knowledge to evaluate both the number of clusters and their validity. Recently, several methods based on the concept of stability have been proposed to estimate the "optimal" number of clusters, but despite their successful application to the analysis of complex bio-molecular data, the assessment of the statistical significance of the discovered clustering solutions and the detection of multiple structures simultaneously present in high-dimensional bio-molecular data are still major problems. Results: We propose a stability method based on randomized maps that exploits the high-dimensionality and relatively low cardinality that characterize bio-molecular data, by selecting subsets of randomized linear combinations of the input variables, and by using stability indices based on the overall distribution of similarity measures between multiple pairs of clusterings performed on the randomly projected data. A χ²-based statistical test is proposed to assess the significance of the clustering solutions and to detect significant and, if possible, multi-level structures simultaneously present in the data (e.g. hierarchical structures).

CiteSeerX

Crossref

AIR Universita degli studi di Milano

Springer - Publisher Connector

PubMed Central

Common Genetic Variants of the Human Steroid 21-Hydroxylase Gene (CYP21A2) Are Related to Differences in Circulating Hormone Levels

Author: A Patocs
A Szilagyi
Andrzej T. Slominski
Attila Patócs
D Posada
DA Calhoun
DA Vassiliadi
DF Conrad
DJ Balding
Edit Gláz
F Faul
F Mantero
George Füst
GJ Arason
GR Abecasis
H Lefebvre
Henriette Farkas
J Kramer
JA Szabo
JC Barrett
JH Bauer
Julianna Anna Szabó
Júlia Pázmándi
K Racz
Klára Koncz
KT Weber
Károly Rácz
L Barzon
L Excoffier
M Ehrhart-Bornstein
M Sereg
M Stephens
M Toth
MA Zeiger
MI New
Miklós Tóth
MR Hoehe
Márta Korbonits
Márton Doleschall
P Librado
PC White
PF Koppens
Péter Igaz
RJ Auchus
SM Baumgartner-Parzer
WF Young Jr
WL Miller
Y Higashi
Z Banlaki
Z Yang
Zoltán Prohászka
Ágnes Szilágyi
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Hungarian Scientific Research Fund (OTKA, PD100648 (AP)) Technology Innovation Fund, National Developmental Agency (KTIA-AIK-2012-12-1-0010). AP is the recipient of a “Lendület” grant from the Hungarian Academy of Sciences.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Repository of the Academy's Library

Queen Mary Research Online

Semmelweis Repository

FigShare